Efficient Algorithms for Descendent Subtrees Comparison of Phylogenetic Trees with Applications to Co-evolutionary Classifications in Bacterial Genome
Authors
Abstract
A phylogenetic tree is a rooted tree of unbounded degree in which each leaf node is uniquely labelled from 1 to n. A descendent subtree of a phylogenetic tree T is the subtree composed of all edges and nodes of T descending from a vertex. Given a set of phylogenetic trees, we present linear-time algorithms for finding all leaf-agree descendent subtrees as well as all isomorphic descendent subtrees. The normalized cluster distance, d(A, B), of two sets A and B is defined by d(A, B) = ∆(A, B)/(|A| + |B|), where ∆(A, B) denotes the size of the symmetric set difference of the two sets. We show that computing all-pairs normalized cluster distances between descendent subtrees of two phylogenetic trees can be done in O(n^2) time. Since the total size of the output is Θ(n^2), the algorithm is computationally optimal. A nearest subtree of a subset of leaves is a descendent subtree that has the smallest normalized cluster distance to these leaves. We show that finding nearest subtrees for a collection of pairwise disjoint subsets of leaves can be done in O(n) time. Several applications of these algorithms in bioinformatics are considered. Among them, we discuss functional analysis and classification of two-component systems (2CS) in bacterial genomes.
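The definitions in the abstract can be illustrated with a small sketch. The code below is a brute-force illustration only, not the efficient algorithms the abstract refers to: it computes the leaf set of every descendent subtree, evaluates d(A, B) directly, and scans all subtrees for the nearest one. The tree encoding (a dictionary from internal node names to child lists, with integer leaves) and the function names are assumptions made for this example.

```python
from typing import AbstractSet, Dict, FrozenSet, Hashable, List, Tuple

# Assumed encoding: internal node -> list of children.
# Any node that is not a key of the dictionary is treated as a leaf.
Tree = Dict[Hashable, List[Hashable]]


def normalized_cluster_distance(a: AbstractSet, b: AbstractSet) -> float:
    """d(A, B) = |A symmetric-difference B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # assumption: two empty sets are at distance 0
    return len(a ^ b) / (len(a) + len(b))


def leaf_clusters(tree: Tree, root: Hashable) -> Dict[Hashable, FrozenSet]:
    """Map every node to the leaf set of its descendent subtree."""
    clusters: Dict[Hashable, FrozenSet] = {}

    def visit(node: Hashable) -> FrozenSet:
        children = tree.get(node, [])
        if not children:
            clusters[node] = frozenset([node])
        else:
            clusters[node] = frozenset().union(*(visit(c) for c in children))
        return clusters[node]

    visit(root)
    return clusters


def nearest_subtree(tree: Tree, root: Hashable,
                    leaves: AbstractSet) -> Tuple[Hashable, float]:
    """Brute-force search: root of the descendent subtree closest to `leaves`."""
    clusters = leaf_clusters(tree, root)
    best = min(clusters,
               key=lambda v: normalized_cluster_distance(clusters[v], leaves))
    return best, normalized_cluster_distance(clusters[best], leaves)


# Hypothetical five-leaf tree used only for illustration.
example_tree: Tree = {"r": ["a", "b"], "a": [1, 2], "b": [3, "c"], "c": [4, 5]}
node, dist = nearest_subtree(example_tree, "r", {3, 4})
print(node, dist)
```

For the leaf subset {3, 4}, the subtree rooted at "b" has leaf set {3, 4, 5}, so the symmetric difference has size 1 and the distance is 1/(3 + 2) = 0.2, which is smaller than the distance to any other descendent subtree of this example tree. This scan recomputes a set difference for every subtree, so it is much slower than the bounds stated in the abstract.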
Similar resources
Quantitative Comparison of Tree Pairs Resulted from Gene and Protein Phylogenetic Trees for Sulfite Reductase Flavoprotein Alpha-Component and 5S rRNA and Taxonomic Trees in Selected Bacterial Species
Introduction: FAD is the cofactor of the FAD-FR protein family. Sulfite reductase flavoprotein alpha-component is one of the main enzymes of this family. Because of its applications in biotechnology and industry, this enzyme was chosen as the subject of evolutionary studies in 19 specific species. Method: Gene and protein sequences of sulfite reductase flavoprotein alpha-component, 5S rRNA sequence...
Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application
With the rapid development of the internet, the amount of information and data being produced is massive. Clients can therefore be overwhelmed by the volume of data, and it is difficult to tell which data are useful. Data mining can overcome this problem. When data mining is run on cloud computing, it reduces processing time, energy usage, and costs. As the speed of ...
Comparison of Phylogenetic and Evolutionary of Nucleotide Squences of HVR1 region of Mitochondria genom in Goats and Other Livestock Species
Maintaining genomic diversity in goat populations in different parts of Iran is essential for breeding programs, increased production, survival, resistance to disease, and adaptation to changing environmental conditions. The aim of the present study was to determine the sequence of HVR1 from the mitochondrial genome of Iranian native goats including Sistani, Pakistani, Black and Lorry ecotypes...
Determining Difference in Evolutionary Variation of Bacterial RecA proteins vs 16SrRNA Genes by using 16s_Toxonomy Tree
Background and Aims: The rate of variation differs among the genes of a bacterial species during evolution. Therefore, in systematic bacterial studies many researchers compare the phylogenetic tree of a particular gene to the standard tree of an rRNA gene. Given the importance of the 16SrRNA gene and the evolutionary process of the RecA protein family, we investigated the changes in the select...